ABSTRACT

We discuss methods currently in use for determining the significance of peaks in the peri- 
odograms of time series. We discuss some general methods for constructing significance tests, 
false alarm probability functions, and the role played in these by independent random variables 
and by empirical and theoretical cumulative distribution functions. We also discuss the concept 
of "independent frequencies" in periodogram analysis. We propose a practical method for es- 
timating the significance of periodogram peaks, applicable to all time series irrespective of the 
spacing of the data. This method, based on Monte Carlo simulations, produces significance tests 
that are tailor-made for any given astronomical time series. 

Subject headings: Methods: data analysis — Methods: statistical — Stars: oscillations 



1. INTRODUCTION 

Periodogram analysis is a vital ingredient of as- 
teroseismology. It is used to identify periodicities 
of oscillations in the observed star. Typically, the 
data analysed are noisy. The effect of noise in 
the data is to produce spurious peaks in the pe- 
riodogram which arise, not because of any peri- 
odicity in the observed system, but because of the 
way that the noisy signal has been sampled. These 
spurious peaks can be surprisingly large. It is es- 
sential therefore to have reliable tests by which to 
determine the significance of periodogram peaks. 

This topic has already received attention in
the literature. Key classical papers include those
of Deeming (1975), Lomb (1976), Scargle (1982),
and Horne & Baliunas (1986). We discuss pertinent
aspects of these papers in the sections that
follow. More recently, criticisms of these papers
have appeared in the work of Koen (1990)
and Schwarzenberg-Czerny (1998), amongst others.
Much of the criticism has revolved around
the appropriate means for attaching significance
to peaks that arise in a calculated periodogram.

Significance tests for periodograms are hugely 
important to the asteroseismologist who relies on 
periodograms to deliver precise values for pur- 
ported eigenfrequencies of pulsation. Comparison 
of the values of observationally determined eigen- 
frequencies with the values predicted by the lat- 
est theoretical models should, in principle, allow 
the identification of modes actually excited in real 
stars, and, subsequently, allow for asteroseismo- 
logical analysis of those stars. 

Asteroseismology appears to be on the threshold
of a golden age, as extensive surveys like ASAS
(Pojmanski 1998), and space missions in the mold
of COROT (Baglin et al.), hugely increase
the number of known pulsating stars, as well as the 
time coverage available for their analysis. It is ex- 
pected that periodogram analysis will continue to 
play a prominent role in asteroseismology. Hence, 
accurate interpretation of periodogram peaks is an 
issue of prime importance. 

In this paper, we consider methods currently 
used for assessing the relevance of periodogram 
peaks, and propose a practical method applicable 
to all time series, irrespective of the spacing of 
the data. This method produces significance tests 
that are tailor-made for any given astronomical 
time-series. 

The structure of this paper is as follows. 
We first discuss the construction of significance 
tests in general, and of Scargle's significance 
test in particular. We consider the concept of
"independent frequencies" in periodogram analysis.
We comment on aspects of the work
reported by Scargle (1982), Horne & Baliunas
(1986), and Schwarzenberg-Czerny (1998). We
report our attempts at reproducing the results of
Horne & Baliunas (1986) by Monte Carlo simulation,
and discuss our failure to reproduce their
results in detail. The conclusions forced on us by 
the discrepancies between our results and theirs 
lead us to the main points made in this paper. 
They also lead us to propose a pragmatic method, 
applicable to all time series, for assessing the sig- 
nificance of periodogram peaks. En route, we also 
discuss the problem of over-sampling the peri- 
odogram. 

Definitions of the periodogram assumed in our
discussion are given in Appendices A and B of this
paper. Detailed discussions of the phenomena of
aliasing and spectral leakage, to which we refer in
the text, may be found in Deeming (1975) and
Scargle (1982).



2. SIGNIFICANCE TESTS 

Noisy data produce noisy periodograms. Peaks 
in a periodogram may therefore not be due to the 
presence of any real periodic phenomenon at all. 
They may simply be random fluctuations in peri- 
odogram power caused by the presence of a noise 
component in the data. Peaks arising in this way 



are spurious: they are not due to any real peri- 
odicity in the observed phenomena, but are sim- 
ply artifacts of chance events in the accompanying 
noise. 

Simulations show that noise in a time series can 
produce surprisingly large spurious peaks in the 
associated periodogram. It is important therefore 
to develop reliable tests for determining whether a 
given periodogram peak reflects a real periodicity 
in the data, or is simply an artifact of the noise. In 
this section, we consider the theoretical basis for a 
class of general, model-independent tests. These
determine the probability that the periodogram
powers observed in a data set might have arisen
from pure noise alone, with no other form of signal
present. For a definition of pure noise, see
the relevant appendix.

Note that, in this paper, we do not use the word 
"power" in the formal statistical sense, where it 
means the probability of rejection of the null hy- 
pothesis given that the null hypothesis is false, but 
in its accepted physical sense. Thus "periodogram
power" at frequency $\omega$ means $P_X(\omega)$, as defined in
Appendices A and B.

The basis for this general class of tests is the 
cumulative distribution function (CDF), 



$$F_Z(z) = \Pr[Z < z] \tag{1}$$

where the random variable $Z = P_X(\omega)$ is the
periodogram power at frequency $\omega$ for the time
series $X$, and $z$ is some selected power threshold.
The function $F_Z(z)$ gives the probability that,
when the data $X$ are pure noise, their periodogram
power at the given frequency $\omega$ does not rise above
power-level $z$.

Suppose a model of the observed system predicts
an oscillation at frequency $\omega$. Then, we expect
$P_X(\omega)$ to be large at this frequency. However,
pure noise by itself might equally well produce a
large value of $P_X(\omega)$. The CDF in equation (1)
provides an objective criterion by which to determine
whether the observed large value of $P_X(\omega)$
is due to the presence of a bona fide signal, or is
nothing more than a spurious large fluctuation due
only to the presence of noise. Suppose the data $X$
are pure noise. Then the probability that the normalised
periodogram power at this frequency is
less than a specified value $z_0$ is

$$p_0 = F_Z(z_0) \tag{2}$$
Inverting this function,

$$z_0 = F_Z^{-1}(p_0) \tag{3}$$

we obtain, for given $p_0$, the threshold power-level
$z_0$ for which a power value $Z < z_0$ has a probability
$p_0$ of being due to pure noise alone. Equivalently,
a power value at frequency $\omega$ that exceeds
$z_0$ has probability $1 - p_0$ of being due to pure noise
alone. This test is both primitive and negative. It
does not tell us that $p_0$ is the probability that our
signal contains a periodic component of frequency
$\omega$, but only that $p_0$ is the probability that our signal
is not pure noise.

In practice, one does not evaluate the periodogram
power at a single frequency only, but at
a selected set $\{\omega_\mu : \mu = 1, 2, \ldots, N\}$ of frequencies.
The procedure normally followed when looking for
periodicities in data is this: the periodogram is
evaluated at the selected frequencies $\omega_\mu$, and the
periodogram power $P_X(\omega_\mu)$ is plotted against $\omega_\mu$;
this plot is then scanned for its highest peaks. The
conclusion one would like to draw from the plot
is that peaks that rise substantially above all others
indicate the presence of genuine periodicities in
the observed system. However, before we can have
confidence in this conclusion, we need first to rule
out the possibility that the observed periodogram
plot could have been produced by pure noise alone.
This is done by calculating the probability that the
entire observed periodogram profile could be produced
by pure noise alone. Suppose $X$ is pure
noise. Consider the probability that all of the periodogram
powers $\{P_X(\omega_\mu) : \mu = 1, 2, \ldots, N\}$ at
the sampled frequencies fall below a specified
power threshold $z$. Define a new random variable,

$$Z_{\max} = \sup \{P_X(\omega_\mu) : \mu = 1, 2, \ldots, N\} \tag{4}$$

Thus $Z_{\max}$ is the maximum periodogram power
among the set of $N$ sampled powers. Now, the
power at each of the sampled values will fall below
some specified threshold $z$ if and only if $Z_{\max} < z$.
We thus need to calculate the CDF

$$F_{Z_{\max}}(z) = \Pr[Z_{\max} < z] \tag{5}$$

The function $F_{Z_{\max}}(z)$ gives the probability that,
when the data $X$ are pure noise, the periodogram
power $P_X(\omega_\mu)$ does not rise above the threshold $z$
at any of the sampled frequencies $\{\omega_\mu\}$. We construct
the second significance test as follows. Let
$z_0$ be a specified power threshold. The probability
that pure noise alone will produce periodogram
powers $P_X(\omega_\mu)$ that do not exceed the threshold
$z_0$ at any of the sampled frequencies is given by

$$p_0 = F_{Z_{\max}}(z_0) \tag{6}$$

Inverting this function,

$$z_0 = F_{Z_{\max}}^{-1}(p_0) \tag{7}$$

For given $p_0$, this inverse function defines a threshold
power-level $z_0$ such that, if the periodogram
power at each of the frequencies $\{\omega_\mu\}$ has value
$Z < z_0$, then the observed periodogram profile has
probability $p_0$ of being due to pure noise alone.
This test reduces the probability of spurious detections.
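The inversion in equation (7) requires a CDF for the highest periodogram power. Where no closed form is available, one option is to use the empirical CDF of highest peaks collected from simulated pure-noise data sets. The following is a minimal sketch of that idea and not the authors' implementation: the function names are hypothetical, and the list of maxima holds placeholder values only.

```python
import math


def empirical_cdf(max_powers, z):
    """Fraction of simulated highest peaks that do not exceed z.

    For continuous powers, using <= rather than < makes no practical difference.
    """
    return sum(1 for v in max_powers if v <= z) / len(max_powers)


def empirical_threshold(max_powers, p0):
    """Smallest simulated peak z0 whose empirical CDF value is at least p0."""
    if not 0.0 < p0 <= 1.0:
        raise ValueError("p0 must lie in (0, 1]")
    ordered = sorted(max_powers)
    index = math.ceil(p0 * len(ordered)) - 1
    return ordered[max(index, 0)]


# Placeholder values standing in for the highest peaks of ten noise simulations.
simulated_maxima = [float(v) for v in range(1, 11)]
z0 = empirical_threshold(simulated_maxima, 0.5)
print(z0, empirical_cdf(simulated_maxima, z0))
```

In practice the list would hold one highest peak per simulated noise series, and many simulations are needed before the upper tail of the empirical CDF is well determined.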

3. SCARGLE'S SIGNIFICANCE TEST 

If the data are Gaussian pure noise, the periodogram
power $Z = P_X(\omega)$ at any given frequency
$\omega$ of the sampled signal $X_k$ is exponentially distributed
with probability density function defined
by (Scargle 1982, p 848),

$$p_Z(z)\,dz = \Pr[z < Z < z + dz] = \frac{1}{\sigma_X^2}\, e^{-z/\sigma_X^2}\, dz \tag{8}$$

The cumulative distribution function is thus given by

$$F_Z(z) = \Pr[Z < z] = \int_{\zeta=0}^{z} p_Z(\zeta)\, d\zeta = 1 - e^{-z/\sigma_X^2} \tag{9}$$

We are interested in the probability that the periodogram
power at the given frequency is greater
than a specified threshold $z$. This is given by

$$\Pr[Z > z] = 1 - F_Z(z) = e^{-z/\sigma_X^2} \tag{10}$$

As the observed power z becomes larger, it be- 
comes exponentially less likely that so high a 
power level (or higher) could be produced by pure 
noise alone, and correspondingly more likely that 
the observed power level is due to a genuine deterministic
(i.e., non-noise) feature in the measured
signal. Of course, this does not necessarily mean
that the suspected deterministic signal is harmonic
with frequency $\omega$, but simply that it is unlikely
that this high power is due to the noise component
alone.
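To give a feel for how quickly the single-frequency probability in equation (10) falls off, here is a small sketch. The function name is hypothetical, and the variance argument is assumed to be the total variance of the data, as the normalisation discussed in this section requires.

```python
import math


def single_frequency_tail(z, variance):
    """Pr[Z > z] for Gaussian pure noise at one frequency: exp(-z / variance)."""
    if variance <= 0.0:
        raise ValueError("variance must be positive")
    return math.exp(-z / variance)


# Tail probability for powers of 1, 5 and 10 times the data variance.
for ratio in (1.0, 5.0, 10.0):
    print(ratio, single_frequency_tail(ratio, 1.0))
```

A power of $\ln 100 \approx 4.6$ times the variance, for example, is exceeded by pure noise at a single pre-selected frequency with probability 0.01.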

It is worth noting that the argument of the
exponential in the cumulative distribution function
is not simply the observed power $z$, but
the ratio $z/\sigma_X^2$, which is the ratio of the periodogram
power to the total variance of the data
(called total input signal power by some). This
is an important point, worth emphasising, as did
Horne & Baliunas (1986). If the incorrect power
ratio is used, then the statistical tests considered
by Scargle will necessarily fail. Thus, normalisation
of the periodogram power by the number $N_0$
of data points used to calculate the periodogram
(classical normalisation), or by the residual power
after a sine curve has been removed from the data,
or by the variance of the observational uncertainty,
all lead to completely different statistical distributions
for the periodogram power and invalidate
Scargle's analysis summarised in this paper. Of
course, this does not make alternative normalisations
"wrong". It does mean however that they
must be accompanied by alternative statistical
analyses (Schwarzenberg-Czerny 1998).

In practice, we do not evaluate the periodogram
power at a single frequency alone, but at a set
of conveniently chosen frequencies $\{\omega_\mu : \mu = 1, 2, \ldots, N\}$.
We shall return to the question of how
to choose these frequencies in a later section. For
the moment, suppose that we have the values of
$P_X$ not at one value of the frequency alone, but
over a set of frequencies. This enables us to devise
a stronger test in which we determine the probability
that the observed periodogram powers over the
entire set of sampled frequencies could have been
produced by pure noise alone.

To develop this new, stronger statistical test,
we need to assume with Scargle that we have evaluated
the periodogram power at a set
$\{\omega_\mu : \mu = 1, 2, \ldots, N_i\}$ of frequencies chosen in such a way
that the random variables
$\{Z_\mu = P_X(\omega_\mu) : \mu = 1, 2, \ldots, N_i\}$ are mutually independent.
Horne & Baliunas (1986) refer to a set of frequencies chosen in this
way as "independent frequencies". This is an
abuse of terminology, since it is not the frequencies
that are "independent", but the random variables
$Z_\mu$. However, this lack of precision leads to no
ambiguity and so is tolerable.

A large body of theorems is available for use 
if the random variables under consideration are 
independent. Abandoning the condition of inde- 
pendence creates serious complications in both the 
reasoning and the proofs of the results. 

Suppose we observe a periodogram power at
one of the $\omega_\mu$ that is higher than a given threshold
$z$. We ask, what is the probability that pure
noise alone could have produced a periodogram
power of this level or higher among all of the sampled
independent periodogram frequencies? First,
we calculate the probability that all the sampled
periodogram powers are less than the threshold
power $z$. Define

$$Z_{\max} = \sup \{Z_1, Z_2, \ldots, Z_{N_i}\}$$

The probability that any given power $Z_\mu$ in this
set falls below the threshold is

$$\Pr[Z_\mu < z] = 1 - e^{-z/\sigma_X^2}$$

Since the $Z_\mu$ are independent, the probability that
they all fall below the threshold $z$ is given by

$$\Pr[Z_1 < z \text{ and } Z_2 < z \text{ and } \ldots \text{ and } Z_{N_i} < z]
= \Pr[Z_1 < z]\, \Pr[Z_2 < z] \cdots \Pr[Z_{N_i} < z]
= \left[1 - e^{-z/\sigma_X^2}\right]^{N_i}$$

The probability that not all the powers $Z_\mu$ are less
than the threshold $z$, that is, the probability that
at least one of the powers $Z_\mu$ is above the threshold
$z$, is then,

$$\Pr[Z_{\max} > z] = 1 - \left[1 - e^{-z/\sigma_X^2}\right]^{N_i} \tag{11}$$

This is the function that Scargle proposes as a false
alarm probability. The idea is that we choose a
probability, say $p_A$, that we regard as an acceptable
level of risk for the false detection of real deterministic
signals. We solve the above formula
for $z$, to get a reference power threshold level $z_A$
given by

$$z_A = -\sigma_X^2 \ln\left[1 - (1 - p_A)^{1/N_i}\right] \tag{12}$$

Then, if we claim a detection whenever the
power level at one of the frequencies
$\{\omega_\mu : \mu = 1, 2, \ldots, N_i\}$ exceeds the reference level $z_A$, the
probability that we will be wrong is given by $p_A$.
That is, on average we will be wrong only a fraction $p_A$ of
the time, since pure noise can produce fluctuations
above this level at these frequencies only a fraction $p_A$ of the
time.
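The false alarm probability of equation (11) and the threshold of equation (12) are inverses of one another, which is easy to check numerically. The sketch below is illustrative only: the function names are hypothetical, and the variance argument is assumed to be the total variance of the data (set to 1 when the powers are already normalised).

```python
import math


def scargle_false_alarm(z, n_i, variance=1.0):
    """Probability that at least one of n_i independent powers exceeds z."""
    single_below = 1.0 - math.exp(-z / variance)
    return 1.0 - single_below ** n_i


def scargle_threshold(p_a, n_i, variance=1.0):
    """Power level exceeded by pure noise with probability p_a."""
    if not 0.0 < p_a < 1.0:
        raise ValueError("p_a must lie strictly between 0 and 1")
    return -variance * math.log(1.0 - (1.0 - p_a) ** (1.0 / n_i))


# Round trip: the threshold for a 1 per cent false alarm risk over 50
# independent frequencies returns a false alarm probability of 0.01.
z_a = scargle_threshold(0.01, 50)
print(z_a, scargle_false_alarm(z_a, 50))
```

Because the threshold grows with the number of independent frequencies, inspecting fewer frequencies lowers the power level needed for a given significance.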

4. INDEPENDENT FREQUENCIES 

Scargle's test is constructed on the assumption
that we can identify a set of frequencies at which
the periodogram powers are independent random
variables. In the case where the time-domain data
are evenly spaced, we are guaranteed the existence
of such a set. These are called the natural frequencies
(Scargle 1982), or the standard frequencies
(Priestley 1981). These are given by

$$\omega_k = \frac{2\pi k}{T} \tag{13}$$

where $T$ is the total time span of the data set, that
is, $T = t_{N_0} - t_1$, and $k = 0, 1, \ldots, [N_0/2]$, where $[N_0/2]$
signifies the integer part of $N_0/2$. The statistics
of $P_X(\omega_k)$ with $k = 0$ are different from those
with $k \neq 0$ (Priestley 1981). If we omit $P_X(\omega_0)$,
this leaves us with at most $[N_0/2]$ independent
frequencies. In practice, the omission of $\omega_0$ from
the set of independent frequencies is of no conse- 
quence. This frequency corresponds to a DC com- 
ponent in the signal which is generally removed 
from the data before their periodogram is calcu- 
lated. Thus, in the case of evenly spaced data, we 
can easily construct the Scargle false alarm proba- 
bility function and apply it to determine the signif- 
icance of high periodogram-power levels at these 
"independent frequencies" . 

It is worth emphasising that, since the false 
alarm probability function assumes independent 
powers at the examined frequencies, we can only 
use it to put a significance level on the values of 
the periodogram-power at the chosen independent 
frequencies. Peaks found at other frequency val- 
ues by over-sampling the periodogram cannot be 
assessed in this way. 
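As a concrete illustration of equation (13), the natural frequencies for an evenly spaced series can be listed as follows. This is a sketch with a hypothetical function name; the time span is passed in by the caller, so either convention for $T$ can be used.

```python
import math


def natural_frequencies(n0, t_span):
    """Angular frequencies 2*pi*k/T for k = 1, ..., [N0/2] (k = 0 omitted)."""
    if n0 < 2 or t_span <= 0.0:
        raise ValueError("need at least two data points and a positive time span")
    return [2.0 * math.pi * k / t_span for k in range(1, n0 // 2 + 1)]


# Eleven data points spanning ten time units give five natural frequencies.
omegas = natural_frequencies(11, 10.0)
print(len(omegas), omegas[0], omegas[-1])
```

The zero-frequency term is left out because, as noted above, the mean is normally removed from the data before the periodogram is calculated.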

In the unevenly sampled case, the situation 
changes dramatically. The statistical analysis 
of the classical periodogram becomes intractable. 
The results are sampling-grid dependent, and no 
general analysis applicable to all cases has yet 
been produced. To simplify the statistical analysis,
Scargle proposed that the definition of the periodogram
be modified. His modified periodogram
had already been used by Barning (1963), Vanicek
(1969), and Lomb (1976). These authors did not
view the modified periodogram as an attempt to 
estimate the Fourier power spectrum from un- 
evenly sampled data, but as a spectral method 
for searching for the best-fit harmonic function 
to their data. The novelty of Scargle's approach 
was that he generated the same spectral method 
as used by these authors by imposing simple con- 
straints on a generalised form of the Fourier trans- 
form. The constraints were that the modified pe- 
riodogram should mimic as closely as possible the 
statistical properties of the classical periodogram, 
and that the resulting spectral function should be 
insensitive to time translations of the data in the 
time domain. 

The demand that the modified periodogram 
should mimic as closely as possible the statisti- 
cal properties of the classical periodogram was 
only partially successful. Forcing time translation 
invariance, and demanding that the statistics of 
the random variable $P_X(\omega)$ at a single selected
frequency remain unchanged, that is, demanding
that $P_X(\omega)$ be exponentially distributed, exhausts
the free parameters in Scargle's modified FT, giv- 
ing Lomb's spectral formula. In this way, he re- 
produced some properties of the periodogram for 
the evenly sampled case. However, this is the best 
that he could do. Most other familiar properties 
are lost. The most important loss is the existence 
of independent frequencies. 

All relevant information about correlation
and mutual dependence of the random variables
$\{P_X(\omega)\}$ is contained in the window function,
$G(\omega)$. (For a discussion of the window function,
see Scargle (1982), Appendix D, p 850, and
also his discussion on p 840.) Thus, the coefficient
of linear correlation between $P_X(\omega)$ and
$P_X(\omega')$ is given by $G(\omega' - \omega)$ (Lomb 1976). For
independence of $P_X(\omega)$ and $P_X(\omega')$, it is necessary
(but not sufficient) that $G(\omega' - \omega) = 0$.
Furthermore, for mutual independence of a set
$\{P_X(\omega_k) : k = 1, 2, \ldots, r\}$ of periodogram powers,
it would also be necessary (but not sufficient) to
have the $\omega_k$ evenly spaced. These are very difficult
conditions to realise in practice. Koen (1990)
has searched numerically for such mutually uncorrelated
sets in a variety of sampling schemes and
failed to turn up more than two simultaneously
uncorrelated frequencies.
For Scargle, this loss of independent frequencies
is not debilitating. He says (p 840, column 1)
that "... if the frequency grid is well chosen, the
degree of dependence between the powers at the
different frequencies is usually small", and (p 840,
column 2) that, "With a wide variety of sampling
schemes $G(\omega)$ does have nulls, or relatively small
minima, that are approximately evenly spaced...
Such nulls comprise a set of natural frequencies
at which to evaluate the periodogram. At these
frequencies the $P(\omega)$ form a set of approximately
independent random variables - thus closely simulating
the situation with evenly spaced data". The
implication, though not explicitly stated by Scargle,
is that in spite of the loss of independence
of the random variables $P(\omega)$ at the natural frequencies,
the false alarm probability given by our
equation (11) (equation (14) in Scargle (1982), p
839), still provides a reliable significance test in
the wide variety of sampling schemes that he considered.

It seems that Scargle's recommendation for the 
case of unevenly spaced data is as follows: evaluate 
the modified periodogram at the natural frequen- 
cies defined by the given data span, and use the 
false alarm probability calculated for the evenly 
spaced case to evaluate the significance of the pe- 
riodogram peaks. He further recommends that, 
to improve the detection efficiency, we decrease 
the number of frequencies inspected (p 842). The 
effect of this reduction is that we reduce the power
threshold for a given significance level of peak
heights.

The value of $N_i$ is a critical ingredient in Scargle's
false alarm probability function. There has
been some debate concerning its correct value,
as well as its meaning. Horne & Baliunas (1986)
appear to have been unsatisfied with the value
$N_i = [N_0/2]$ and proposed to determine $N_i$ by a
method which we describe in the following section.

5. HORNE AND BALIUNAS DETERMINATION OF $N_i$

Horne & Baliunas (1986) (HB in the remainder
of this paper) determined $N_i$ by the following
procedure. They simulated a large number
of data sets, each consisting of pseudo-Gaussian
noise. The periodogram of each data set was evaluated
from $\omega = 2\pi/T$ to $\omega = \pi N_0/T$, where $T$ is
the total time interval. They then chose the highest
peak in each periodogram, combined these, and
fitted the Scargle false alarm probability function
to the peak distribution using $N_i$ as the variable
parameter.
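The procedure just described can be sketched for the evenly spaced case as follows. This is an illustrative reconstruction under stated assumptions, not the code used by HB or by us: it assumes unit sampling interval, takes $T = N_0$ so that the natural frequencies coincide with the discrete Fourier frequencies, uses the classical periodogram $|\sum_j x_j e^{-i\omega t_j}|^2 / N_0$ divided by the sample variance, and fits $N_i$ by a simple grid search over least-squares residuals. All function names are hypothetical.

```python
import cmath
import math
import random


def natural_powers(x):
    """Variance-normalised classical periodogram at k = 1, ..., [N0/2]."""
    n0 = len(x)
    mean = sum(x) / n0
    y = [v - mean for v in x]                    # remove the DC component
    var = sum(v * v for v in y) / (n0 - 1)       # sample variance
    powers = []
    for k in range(1, n0 // 2 + 1):
        s = sum(v * cmath.exp(-2j * math.pi * k * idx / n0)
                for idx, v in enumerate(y))
        powers.append(abs(s) ** 2 / (n0 * var))
    return powers


def highest_peaks(n0, n_sets, rng):
    """Sorted highest power from each of n_sets simulated noise series."""
    return sorted(max(natural_powers([rng.gauss(0.0, 1.0) for _ in range(n0)]))
                  for _ in range(n_sets))


def scargle_false_alarm(z, n_i):
    """Scargle false alarm function for variance-normalised powers."""
    return 1.0 - (1.0 - math.exp(-z)) ** n_i


def fit_n_i(peaks, candidates):
    """Candidate N_i minimising the squared misfit to the empirical curve."""
    m = len(peaks)
    # Fraction of simulated peaks lying above the j-th smallest peak.
    empirical = [(z, 1.0 - (j + 1) / m) for j, z in enumerate(peaks)]

    def squared_error(n_i):
        return sum((scargle_false_alarm(z, n_i) - p) ** 2 for z, p in empirical)

    return min(candidates, key=squared_error)


rng = random.Random(12345)                       # fixed seed for repeatability
peaks = highest_peaks(n0=20, n_sets=300, rng=rng)
grid = [0.5 * step for step in range(2, 81)]     # trial N_i from 1.0 to 40.0
n_fit = fit_n_i(peaks, grid)
print(f"fitted N_i = {n_fit:.1f} for N0 = 20")
```

A grid search is used here only to keep the sketch dependency-free; any one-parameter least-squares routine would serve equally well.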

The HB simulations investigated three major 
types of spacing in the time coordinate. In the 
first, the data were evenly spaced in time. In the 
second, each time followed the previous one by a
random number between 0 and 1. In the third,
the data were clumped in groups of three at each 
evenly spaced time interval. 

In the case where the data are evenly spaced
in time, theoretical statistical analysis provides us
with a very clear, unambiguous picture of what to
expect from the simulations: the random variables
$\{P_X(\omega_k) : k = 1, \ldots, [N_0/2]\}$, where $\omega_k = 2\pi k/T$
and $T$ is the total time interval covered by the
data, are mutually independent; the window function,
which contains all relevant information about
dependencies and correlations of the random variables
$P_X(\omega)$, shows that these are the only frequencies
at which the periodogram powers are independent
(Scargle 1982, pp 840 and 843); the listed frequencies
$\omega_k$ contain maximal information about
the power distribution of the sampled signal. This
is seen from the fact that the discrete Fourier
transform evaluated at these frequencies contains
exactly enough information to reconstruct completely
the original data. So, from theory, we
expect the total number $N_i$ of independent frequencies
in the case of evenly spaced time series
consisting of zero mean pure noise to be exactly
$[N_0/2]$. In practice, a simulated time series, generated
from a zero mean distribution, will not have
precisely zero mean. We must therefore remove its
mean before finding its periodogram. Once this is
done, the theory guarantees that our simulated
data set will have exactly $[N_0/2]$ independent frequencies.
The point is this: for the evenly spaced
data sets that we have simulated, the number of
independent frequencies in the periodogram is at
most $[N_0/2]$. In real data, this number may need
to be further reduced if we estimate other parameters.

Surprisingly, the best fits obtained by HB consistently
produced values of $N_i$ which were substantially
higher than this expected upper limit
(HB, Table 1, p 759). In fact, their fitted values
are consistently higher than $N_0$, with the exception
of their two smallest data sets (10 and 15
points respectively) where the fitted value of $N_i$ is
slightly less than $N_0$, but still about twice as large
as expected.

These results are puzzling. Theory and simulations
appear to be in conflict. Cumming, Marcy & ...
note that Baliunas has indicated typographical
errors in the values listed in HB. Koen
(1990) and Schwarzenberg-Czerny (1996) have
also noted mistakes in HB. We have repeated the
HB simulations for the case of even sampling in the
time domain. We have also extended somewhat
the scope of their investigations to consider the
alternative false alarm probability function proposed
by Schwarzenberg-Czerny (1998), as well as
the effects of over-sampling the periodogram. The
results are interesting, and we report them in the
corresponding sections below.

In our first set of simulations, we attempted to
reproduce the results reported by HB in their Table 1,
p 759, for the case of evenly spaced data. HB
describe the method they followed in their simulations
as follows: "The periodogram of each data
set was evaluated from $\omega = 2\pi/T$ to $\omega = \pi N_0/T$
... The highest peak was then chosen in each periodogram."
It was not clear to us whether they
sampled the periodogram values $P_X(\omega)$ only at
the natural frequencies $\omega_k = 2\pi k/T$, and then
chose the highest periodogram power from this restricted
sampled set, as prescribed by Scargle; or
whether they followed the practice of a not insubstantial
number of astronomers who search for the
highest periodogram peak in the given range by
grossly over-sampling the periodogram, and then
choose the maximum value obtained irrespective
of whether it occurs at one of the natural frequencies
$\omega_k$. Accordingly, we ran two sets of simulations
implementing both procedures. We fitted
the Scargle false alarm function to our results by
the method of least squares. All our periodograms
were normalised using the sample variance of the
simulated data, and not the variance of the distribution
used to generate the sample.

We failed to reproduce the HB results in detail.
Sampling the periodogram at the natural frequencies only and
choosing the highest value among these yielded
values of $N_i$ that were consistently lower than
those obtained by HB. In fact, we obtained values
very close to $[N_0/2]$, as expected theoretically,
but in conflict with the results published by Horne
and Baliunas. Searching for the highest peak by
over-sampling also yielded values that were consistently
lower than those of HB, but higher than those
obtained by sampling at the natural frequencies.
More precisely, our results agree closely with those
of HB for the smaller data sets up to 170 data points.
This leads us to suspect that the HB table was
constructed by gross over-sampling. However, our
results strongly deviate from theirs for the larger
data sets with $N_0 > 170$, with our values being
substantially lower.

Plotting $N_i$ vs. $N_0$ (Figure 1), we observe the following
features. The values yielded by our simulations
increase linearly with $N_0$, as expected. In contrast,
the results published by HB in their Table
1 appear to lie, not on a quadratic (as claimed by
them), but on two straight lines of different slope,
a sharp change in slope appearing for data sets
with $N_0 > 170$. This seems to be indicative of a
systematic error. Fitting a quadratic function to
these data points, as was done by HB, may therefore
be misleading and renders suspect its use in
estimating the parameter $N_i$.

Note however that, in the case of over-sampling,
both our results and those of HB consistently yield
values of $N_i$ that are higher than the theoretically
expected value of $[N_0/2]$. These values are thus
apparently in conflict with the theory. The interpretation
of $N_i$ as the number of independent frequencies
is therefore questionable in this context.
The HB method for determining $N_i$ is eminently
practical and reasonable, but it only yields correct
values when the periodogram is sampled at the
natural frequencies. This means that, in the context
of over-sampling, we cannot assign to the parameter
$N_i$ the meaning that it had in its original
derivation, namely the number of independent frequencies
in the associated periodogram. Rather,
we must treat $N_i$ as nothing more than a floating
parameter in a one-parameter family of candidate
CDF functions which we are attempting to fit to
our data.
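One way to see why over-sampling pushes the fitted parameter upward is that an over-sampled grid with an integer over-sampling factor contains the natural frequencies as a subset, so its highest peak can never be lower than the highest peak on the natural grid. The sketch below illustrates this for a single simulated noise series. It is an illustration under assumptions, not our simulation code: it assumes unit sampling interval, takes $T = N_0$, and uses a hypothetical function name and an arbitrary over-sampling factor of 8.

```python
import cmath
import math
import random


def highest_peak(x, oversample=1):
    """Highest variance-normalised periodogram power on a frequency grid.

    oversample=1 gives the natural grid 2*pi*k/T; a larger integer inserts
    (oversample - 1) extra frequencies between neighbouring natural ones.
    """
    n0 = len(x)
    mean = sum(x) / n0
    y = [v - mean for v in x]                    # remove the DC component
    var = sum(v * v for v in y) / (n0 - 1)       # sample variance
    best = 0.0
    for m in range(oversample, (n0 // 2) * oversample + 1):
        omega = 2.0 * math.pi * m / (oversample * n0)
        s = sum(v * cmath.exp(-1j * omega * idx) for idx, v in enumerate(y))
        best = max(best, abs(s) ** 2 / (n0 * var))
    return best


rng = random.Random(2024)                        # fixed seed for repeatability
series = [rng.gauss(0.0, 1.0) for _ in range(32)]
peak_natural = highest_peak(series, oversample=1)
peak_oversampled = highest_peak(series, oversample=8)
print(peak_natural, peak_oversampled)
```

Repeated over many noise series, this shifts the empirical distribution of highest peaks toward larger powers, which a one-parameter fit can only absorb by increasing the fitted parameter.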

Another problem with the HB method should
be noted. Inspection of a plot of the best-fit Scargle
false alarm probability function shows it to
be a very uncomfortable fit to the experimentally
obtained cumulative distributions of periodogram
peak heights (see Figure 2). This is true both in
the case of sampling at the natural frequencies and
of over-sampling. Its general trend is good: it is
flat near value 1 at low peak heights, drops rapidly
over the peak height mid-range, and levels off to
zero for larger peak heights. However, its detailed
behaviour simply does not match that of the experimental
curve. It drops too quickly, and levels
off too soon. This mismatch is most pronounced
for small data sets, and becomes progressively less
noticeable as the data sets increase in size. But it
never vanishes completely. The conclusion forced
on us by our simulations is that the Scargle false
alarm probability function fails to reproduce the
detailed behaviour of the simulated data sets. This
is both good news and bad news: good news because
it shows that the Scargle function underestimates
the significance of periodogram peaks; and
bad news because it leaves us without a usable
false alarm probability function.

Fig. 1. Plots of $N_i$ vs. $N_0$ of the data published
by Horne & Baliunas (1986) in their Table 1,
p 759, for the case of evenly spaced data,
and of our simulations, fitting the Scargle and
Schwarzenberg-Czerny false alarm functions to the
empirical CDF's obtained by sampling at the natural
frequencies and by over-sampling. Solid dots:
published Horne and Baliunas values; asterisks:
Scargle function fitted to empirical CDF's obtained
by sampling at the natural frequencies; circled
crosses: Schwarzenberg-Czerny function fitted
to the same; stars: Scargle function fitted to
empirical CDF's obtained by over-sampling; triangles:
Schwarzenberg-Czerny function fitted to the
same. The solid line is the theoretically expected
relationship $N_i = [N_0/2]$.

In summary, we cannot in general regard Ni 
as anything more than a fitting parameter. Fur- 
thermore, the Scargle probability function incor- 
rectly describes the statistical behaviour of the pe- 
riodogram in these simulations. We discuss a pos- 
sible reason for its failure in a later section. For the 
moment, we simply note that it manifestly fails to 
provide a convincing fit to the empirical CDF pro- 
duced by our simulations. It displays the correct 
general characteristics of a CDF but, notoriously, 
all CDF's tend to look alike, so simply displaying 
correct general features is not a point in its favour. 
Our conclusion therefore is that the HB method is 
not in general a way to assess the number of inde- 
pendent frequencies in a periodogram. Rather, it is 
a method for estimating the best-fit parameter Ni 
in an ill-fitting class of candidate CDF functions. 

This does not make the Scargle probability 
function or the HB method for estimating Ni 
worthless. In those cases (large data sets) where 
the Scargle function gives a reasonable fit to the 
empirical data, the HB method provides a value 
of Ni that makes the Scargle function a good es- 
timate of the correct false alarm probability and 
provides a formula in closed form that can be used 
as a significance test. However, note that this for- 
mula consistently underestimates the significance 
of periodogram peaks, this underestimation be- 
coming increasingly severe as the data sets become 
smaller. 
Table 1: Results of Monte Carlo simulations 

                               Over-sampled       Natural Frequencies
                              Scargle        SC    Scargle        SC
         HB Value    Number  Function  Function   Function  Function
     No     of Ni  of Tests  Best-fit  Best-fit   Best-fit  Best-fit
                                   Ni        Ni         Ni        Ni
 ---------------------------------------------------------------------
  10.00      9.70   1395.00      9.09     26.80       5.00      8.70
  15.00     14.45    347.00     14.09     32.80       7.90     13.60
  25.00     27.38    213.00     24.91     47.80      12.80     20.00
  35.00     38.40    214.00     35.64     60.40      18.50     26.50
  50.00     54.45    369.00     54.00     84.00      25.70     34.40
  64.00     71.76    512.00     70.45    102.80      33.00     42.40
  75.00     86.05    153.00     85.82    121.90      39.30     50.10
 100.00    119.58    296.00    113.91    152.10      51.60     62.80
 128.00    152.53    913.00    149.09    191.80      65.50     77.70
 170.00    218.33    218.00    210.09    261.40      89.30    103.00
 256.00    369.97    224.00    306.36    361.20     128.40    143.00
 300.00    455.95    107.00    361.45    420.20     148.40    163.30
 400.00    618.69    106.00    477.18    540.10     204.30    221.60


Note. — Comparison of Horne and Baliunas values of Ni with the results of our numerical simulations, fitting both Scargle and 
Schwarzenberg-Czerny (SC) false alarm functions to CDF's constructed from over-sampled periodograms and from periodograms 
sampled at the natural frequencies. The corresponding best-fit functions are displayed in Figures 2 and 3 together with the 
corresponding functions constructed with the correct value of Ni = [No/2]. 



6. THE SCHWARZENBERG-CZERNY 
FALSE ALARM FUNCTION 



Koen (1990) pointed out an important implicit 
assumption in Scargle's derivation of his false 
alarm probability function. Scargle assumed that 
the variance σ² of the data Xk is known a priori. 
There are situations in which this condition is true, 
but it is satisfied neither in the case of real astro- 
nomical data nor in that of the HB simulations. 
In the simulations, pseudo-data are generated us- 
ing a preselected variance and mean (chosen to be 
zero), but the variance and mean of the generated 
sample will differ in general from those used in 
their generation. Thus both variance and mean 
need to be estimated from the data. 

This changes the statistical analysis signifi- 
cantly. Schwarzenberg-Czerny (1998), in a partic- 
ularly clear and thorough exposition of the issues 
involved, has shown that the CDF of maximum 
peak heights appropriate to the Lomb-Scargle pe- 
riodogram and calculated from a finite sample of 
Gaussian pure noise, is the (regularised) incom- 
plete beta function 

$$ I_{1 - z/[N_0/2]}\left([N_0/2],\, 1\right) = \left(1 - \frac{z}{[N_0/2]}\right)^{[N_0/2]} \tag{14} $$


To construct the corresponding false alarm proba- 
bility function, we need to use this distribution in 
place of the exponential distribution used above. 
If in our periodogram we can identify a set of 
frequencies at which the periodogram powers are 
mutually independent, then the probability that 
the power at at least one of these frequencies rises 
above a given threshold power z is given by 

$$ \Pr[Z_{\max} > z] = 1 - \left[1 - \left(1 - \frac{z}{[N_0/2]}\right)^{[N_0/2]}\right]^{N_i} \tag{15} $$

where Ni is the number of mutually independent 
frequencies inspected, and 
$Z_{\max} = \sup\{Z_1, Z_2, \ldots, Z_{N_i}\}$ 
is the maximum power among the mutually inde- 
pendent powers Z_k. In our discussion, we shall 
call equation (15) the Schwarzenberg-Czerny false 
alarm probability function. In passing, note that 
Schwarzenberg-Czerny (1998) provides a number 
of alternative distributions and test statistics ap- 
propriate to other methods of data analysis and 
is able thereby to resolve extant disputes about 
Fig. 2. — Empirical CDF's (heavy dotted line) 
constructed by over-sampling the periodogram, 
with best-fitting Scargle false alarm probability 
function (solid line) for (a) No = 10, (b) No = 50, 
and (c) No = 75 data points. The fit improves 
with increasing No. The corresponding best-fit 
values of Ni are (a) Ni = 9.09, (b) Ni = 54.00, 
and (c) Ni = 85.82. The light dashed line shows 
the Scargle function for Ni = [No/2]. In all cases 
the best-fit value of Ni exceeds [No/2]. 



the "correct" normalisation procedure for peri- 
odograms. 

In the limit No → ∞, the distribution in 
equation (14) becomes exponential and coincides 
with that used by Scargle. Accordingly, in the 
same limit, the associated false alarm probabil- 
ity function in equation (15) reduces to the Scar- 
gle false alarm function. A Q-Q plot of the 
Schwarzenberg-Czerny vs. Scargle false alarm 
functions (see Schwarzenberg-Czerny (1998), Fig- 
ure 1, p 835) shows that, while the agreement be- 
tween them is good for large No, they differ sub- 
stantially for small data sets, with Schwarzenberg- 
Czerny's false alarm function yielding consistently 
smaller false alarm probabilities than Scargle's. 
According to this analysis, therefore, for given 
Ni, the Scargle false alarm function consistently 
underestimates the statistical significance of peri- 
odogram peaks. 
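
The comparison between the two false alarm functions can be illustrated numerically. The following is a minimal sketch, assuming the forms of equations (14) and (15) given above and writing m for [No/2]; the function names `scargle_fap` and `sc_fap` are hypothetical, and the clamp for z >= m is an assumption made here because (1 - z/m) is not a valid probability beyond that point.

```python
import math

def scargle_fap(z, n_i):
    """Scargle false alarm probability: 1 - (1 - exp(-z))**n_i."""
    return 1.0 - (1.0 - math.exp(-z)) ** n_i

def sc_fap(z, n_0, n_i):
    """Schwarzenberg-Czerny false alarm probability, equation (15), with m = [n_0/2]."""
    m = n_0 // 2
    if z >= m:
        # Assumption: outside 0 <= z < m the single-frequency tail is taken as zero.
        return 0.0
    single = (1.0 - z / m) ** m          # single-frequency tail, equation (14)
    return 1.0 - (1.0 - single) ** n_i

# Small data set: the Schwarzenberg-Czerny value lies well below Scargle's.
print(sc_fap(3.0, 10, 5), scargle_fap(3.0, 5))
# Large data set: the two functions nearly coincide.
print(sc_fap(3.0, 10000, 50), scargle_fap(3.0, 50))
```

Since (1 - z/m)^m never exceeds e^{-z} for 0 <= z < m, the sketch returns the smaller probability from the Schwarzenberg-Czerny form at every threshold for a given Ni, which is the behaviour described in the text.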

One reason for the failure of the Scargle func- 
tion to reproduce the behaviour of our empiri- 
cal CDF's may be its implicit assumption that 
the variance σ² is known a priori. To cor- 
rect this error, we replaced the Scargle func- 
tion by Schwarzenberg-Czerny's and repeated the 
HB simulations for equally spaced data. Using 
their method for determining Ni, we fitted the 
Schwarzenberg-Czerny false alarm function to our 
empirical CDF's. We found very good, but not 
perfect, agreement between the best-fit theoreti- 
cal curves and the corresponding empirical ones, 
with the greatest deviations occurring for small 
data sets (see Figure 3). For these, the theoret- 
ical best-fit curves consistently yield values that 
are lower than those of the empirical curves, thus 
overestimating the significance of peaks. For the 
larger values of No, the deviations of the fitted 
from the empirical curves may be understood in 
the context of order statistics. 

In spite of the excellent nature of these fits, 
there is nevertheless an interesting feature in 
these results that is worth noting. For the CDF's 
of periodogram powers sampled at the natural 
frequencies, the best-fit values of Ni are consis- 
tently larger than the theoretically expected num- 
ber of independent frequencies, which is at most 
[No/2] (see Table 1). Correspondingly, a plot of 
the Schwarzenberg-Czerny function for the value 
[No/2] of independent frequencies yields a curve 
that deviates badly from the corresponding em- 
Fig. 3. — Empirical CDF's (heavy dotted 
line) constructed by over-sampling the peri- 
odogram, with best-fitting Schwarzenberg-Czerny 
false alarm probability function for (a) No = 10, 
(b) No = 50, and (c) No = 75 data points. 
The fits are significantly better than the corre- 
sponding ones for Scargle's function. However, 
for low No, Schwarzenberg-Czerny's distribution 
is still significantly different from the empirical 
one and overestimates the significance of high 
peaks. The light dashed line shows the corre- 
sponding Schwarzenberg-Czerny false alarm func- 
tion for Ni = [No/2]. In all cases, the best-fit 
value of Ni again exceeds [No/2]. 



pirical CDF and which leads to a severe overesti- 
mation of the significance of periodogram peaks. 
We have no option but to conclude from these 
results that, like the Scargle false alarm function, 
the Schwarzenberg-Czerny false alarm function, 
given by equation (15), appears not to describe 
the CDF's of our simulations. Note also from 
Table 1 that the best-fit values of Ni for CDF's 
constructed from over-sampled periodograms are 
higher than those for the CDF's obtained by sam- 
pling at the natural frequencies. This is consis- 
tent with our previous results for the Scargle false 
alarm function. The results of our simulations 
again appear to be at variance with the theory. 
For evenly spaced data, the theory (which seems 
unassailable) predicts unambiguously the exis- 
tence of at most [No/2] independent frequencies, 
with the CDF for periodogram powers sampled 
at these frequencies given by the Schwarzenberg- 
Czerny false alarm function. Our empirical CDF's 
differ substantially from those predicted by this 
theory, with HB best-fits occurring at values of 
Ni that are consistently higher than expected. 
These results force us to the following conclusions. 
First, when using the HB method to estimate Ni, 
we cannot interpret the best-fit value of Ni as the 
number of independent frequencies in our peri- 
odogram. Rather, we must treat Ni as a floating 
parameter in a one-parameter family of candi- 
date CDF functions. This conclusion is consistent 
with that stated in the previous section. Second, 
as a candidate CDF function, the Schwarzenberg- 
Czerny false alarm function appears to be superior 
to Scargle's. 

7. FALSE ALARM FUNCTIONS FOR 
UNEVENLY SPACED DATA 

The principal difficulty encountered when 
searching for a false alarm function in the case 
of unevenly spaced data is the loss of the so-called 
independent frequencies. This loss is not appar- 
ent. It is real. The problem is not that they are 
difficult to identify but nevertheless present. It is 
that they are not there at all, except perhaps for 
a small set that can be counted on the fingers of 
one hand. A significance test based on so small a 
number of independent frequencies is not useful. 
It would require us to sample the periodogram at 
no more than a few frequencies, making it highly 
likely that we would miss most of the significant 
periodicities in our data because of sparse sam- 
pling. 

The loss of a sizeable set of mutually indepen- 
dent frequencies puts us into a difficult, possibly 
intractable, position vis-a-vis the search for a theo- 
retical formula in closed form for false alarm prob- 
abilities. Were such a formula available, it would 
certainly be a great boon. Realistically however, 
it seems unlikely that such a formula could ever 
be found for the general case of arbitrarily spaced 
data. 

The only alternative to a theoretical false 
alarm function is an empirically generated one. 
Schwarzenberg-Czerny (1998) expresses a distinct 
lack of confidence in this approach. He states, 
p 832, "We consider the opinion that all statis- 
tical problems related to the periodograms can 
be solved by Monte Carlo simulations to be over- 
optimistic". His skepticism regarding simulations 
is due to the unreliability of random number 
generators. He says, p 832, "The simulations 
have problems of their own, related chiefly to the 
untested effects of the discrete random number 
generators and periodogram algorithms on the 
tails of the continuous distributions", and again 
on p 833, "The Monte Carlo simulations rely on 
rare events of low probability, for which neither 
the accuracy of random number generators nor 
the accuracy of periodogram algorithms is well 
tested". 

The strong sentiments expressed by Schwarzenberg- 
Czerny offer little cheer to observers, whose prin- 
cipal need is a reliable method for assessing candi- 
date periodogram peaks. The current generation 
of theoretical distributions are all based on the 
assumption of independent frequencies and all re- 
quire a value of Ni. In the case of evenly spaced 
data, it might be argued that the correct value for 
Ni is [No/2], suitably reduced by the number of 
parameters already estimated from the data. For 
the general case however, even were we to believe 
the conjecture that independent frequencies exist, 
there appears to be no clear a priori theoretical 
criterion for choosing the value of Ni, and the only 
practical method offered is that of HB in which 
we fit some chosen theoretical distribution to the 
simulated CDF's. Necessity therefore forces us, 
against Schwarzenberg-Czerny's advice, into the 
route of Monte Carlo simulations. 

The realisation that the conjecture of the exis- 
tence of independent frequencies is false forces us 
to re-evaluate both the role of Monte Carlo simula- 
tions and the use of theoretical false alarm proba- 
bility functions. Schwarzenberg-Czerny's opinion 
regarding random number generators is not un- 
warranted. However, the performance of random 
number generators is continually being improved. 
There is every reason to believe therefore that ex- 
isting problems with random number generators 
will eventually be resolved. In contrast, the prob- 
lem of the lack of independent frequencies is per- 
manent. The Monte Carlo simulation option is 
therefore not as bleak as may first appear. As 
regards the use of theoretical false alarm proba- 
bility functions, we do not really need them. The 
empirically generated CDF's contain all the infor- 
mation that we need, whether or not we have a 
closed-form formula for them, and can be used to 
determine significance thresholds. A closed-form 
formula would be useful insofar as it facilitates cal- 
culation of the thresholds, but is not essential. If 
one is needed, we can resort to fitting the empiri- 
cal CDF as closely as possible by any suitable form 
of trial function. In fact, we do not need even 
to fit the entire CDF. We are interested only in 
the high-peak tail above a certain minimum con- 
fidence threshold and so need only obtain a good 
fit in that region. Should formulae be needed for 
other regions, we can resort to multiple fits that 
together cover the entire CDF. 

8. THE PROBLEM OF OVER-SAMPLING 

Theoretical false alarm probability functions 
are based on the assumption of the existence of in- 
dependent frequencies and contain the number N 
of frequencies inspected as a parameter. When the 
periodogram is inspected at the maximum number 
Ni of independent frequencies, N = Ni. For ex- 
ample, Scargle's function is given by 

$$ \Pr[Z_{\max} > z] = 1 - F_{Z_{\max}}(z) = 1 - \left(1 - e^{-z}\right)^{N} \tag{16} $$



when N is the number of independent frequencies 
inspected. 

From Figure 4 it is seen that, for given z, this 
probability increases as the number N of sam- 
pled independent frequencies is increased. Scar- 
gle, p 839, describes this property as the statisti- 
cal penalty that we must pay for inspecting a large 
number of frequencies. He explains this by saying 
that "if N independent experiments are carried 
out, even if each one has a very small probability 
of succeeding, the chance of one of them succeed- 
ing is very large if N is large enough (approaching 
certainty as N approaches infinity)." He also notes 
that the expected value of the maximum power 
Zmax of a white noise spectrum over a set of N 
frequencies at which the power is independent is 
given by 

$$ \langle Z_{\max} \rangle = \sum_{k=1}^{N} \frac{1}{k} \tag{17} $$



which diverges logarithmically with N. 
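
The slow growth of this sum is easy to check numerically. The sketch below evaluates the sum in equation (17) directly and compares it with ln N plus the Euler-Mascheroni constant, the standard large-N approximation to the harmonic sum; the function name `expected_max_power` is a hypothetical label chosen here.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant

def expected_max_power(n):
    """Equation (17): expected maximum of n independent unit-mean exponential powers."""
    return sum(1.0 / k for k in range(1, n + 1))

for n in (10, 100, 1000, 10000):
    # Columns: N, harmonic sum, logarithmic approximation ln(N) + gamma.
    print(n, expected_max_power(n), math.log(n) + EULER_GAMMA)
```

Each tenfold increase in N adds only about ln 10 (roughly 2.3) to the expected maximum power, so the divergence, while real, is very gradual.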

These comments appear alarming. At face 
value, they seem to suggest that prodigious sam- 
pling of the periodogram at the independent fre- 
quencies might lead eventually to the dismissal of 
all periodogram peaks as spurious. They also ap- 
pear strongly to discourage over-sampling of the 
periodogram in an attempt to pin down more pre- 
cisely the frequency of a periodicity. Indeed, their 
effect has been so strong on some that they refuse 
to evaluate periodogram power at any frequencies 
other than a selected subset of the "natural fre- 
quencies". Were these extreme conclusions drawn 
from Scargle's comments correct, the periodogram 
method for searching for periodicities would be 
severely compromised. 

To understand Scargle's comments correctly, we 
need first to note that the false alarm function is 
deduced assuming that we are able to identify N 
independent frequencies ω_k. An evenly sampled 
time series consisting of No data points guarantees 
the existence of at most N = [No/2] mutually in- 
dependent frequencies, namely the "natural" ones. 
There can be no more. The original data set can 
be fully recovered from the DFT at these frequen- 
cies. So the information contained in periodogram 
powers at all other frequencies cannot be indepen- 
dent of these. For a given evenly sampled time 
series consisting of No points, there is a maxi- 
mum value of N at which we can sample the peri- 
odogram independently, namely [No/2], and hence 
a maximum value of ⟨Z_max⟩. In practice therefore, 
there is no logarithmic divergence to fear. 

Second, if we over-sample the periodogram, the 
CDF is no longer correct. The powers at the sam- 
pled frequencies are no longer independent, and so 
equation (fTTj) ceases to be the correct description 



of the distribution. In these circumstances, it is 
not useful to look for a theoretical formula for the 
CDF. Even if it were mathematically tractable, it 
probably would not be worth the effort of obtain- 
ing an expression in closed form for it. The diffi- 
culties in obtaining a formula for this CDF, how- 
ever, do not prevent us from obtaining an excel- 
lent approximation to it through numerical exper- 
iments. The results of our simulations in this re- 
spect are encouraging. Successive over-samplings 
produce progressively less effect on the CDF, until 
it eventually converges to a limiting CDF beyond 
which no further refinement of the sampling grid 
changes the result. (See Figure 5.) With hind- 
sight, we should have expected this. The original 
time domain data contain a finite amount of in- 
formation. There is therefore a limit to how much 
information they can be forced to yield. 

Based on the numerical experiments described 
in this paper, we would therefore like to refine 
Scargle's lesson, drawn from a consideration of 
statistical penalties. The (gloomy) lesson he drew 
was: "If many frequencies are inspected for a spec- 
tral peak, expect to find a large peak power even 
if no signal is present" (Scargle 1982, p 840, col- 
umn 1). Our revision of Scargle's lesson is this: 
If many frequencies are inspected for a spectral 
peak, expect to find a large peak power even if no 
signal is present - but the total number of inde- 
pendent frequencies present in any given time se- 
ries is limited, so don't expect the number of large 
peaks produced by white noise to increase with- 
out limit. More importantly, over-sampling the 
periodogram does not dramatically increase 
the number of large peaks expected. 

9. A PRACTICAL METHOD FOR DETERMINING 
FALSE ALARM PROBABILITIES 

The theoretical false alarm probability func- 
tions extant in the literature all rely for their va- 
lidity on the existence of independent frequencies. 
Such a set is guaranteed for evenly spaced data, 
but not for data that are unevenly spaced. Even 
in the case where the data are evenly spaced, we 
may wish to inspect the periodogram at frequen- 
cies that do not coincide with Scargle's natural 
ones. Such is the case when a pronounced peak oc- 
curs at an intermediate frequency. How do we as- 
Fig. 4. — Scargle false alarm probability function 
as a function of N for values N = 10, 50, 100, 200. 
As N increases, the probability of finding a peak 
above any given threshold value increases. This 
illustrates Scargle's 'statistical penalty': if many 
independent frequencies are inspected for a spec- 
tral peak, we should expect to find a large peak 
even when no signal is present. As N increases, 
the CDF moves progressively to larger peak-height 
values without limit. 
Fig. 5. — Empirical CDF's as a function of over- 
sampling. The figure shows the CDF's corre- 
sponding to sampling at 1, 2, 3, 4, 5, and 10 
times the Scargle sampling rate. The correspond- 
ing CDF's converge rapidly to a limiting CDF. The 
limiting CDF coincides almost perfectly with the 
CDF for an over-sampling factor of 10. 



sess the significance of periodogram peaks in these 
cases? 

Based on our investigations described above, we 
suggest the following method: 

1. Using the sampling times of the actual data 
set to be analysed, construct a large number 
of pseudo-Gaussian random time series. 

2. Select a convenient grid of frequencies that 
cover the frequency range in the peri- 
odogram that is to be inspected. (We discuss 
how to choose these frequencies in the next 
paragraph. For the moment, assume that 
they have been selected.) 

3. Construct the periodogram for each pseudo- 
random time series, sampling it at each of 
the selected frequencies. 

4. In each periodogram, identify the highest 
periodogram power that occurs at the pre- 
selected frequencies only, and use these high- 
est values to construct the CDF of these 
highest power values. 

The CDF thus obtained is an empirically gener- 
ated graphical representation of the probability 
function Pr[Z_max ≤ z]. It gives the probability 
that pure noise alone could have produced power 
values less than or equal to a given threshold value 
z at each of the selected sampling frequencies. 

The plot of 1 - Pr[Z_max ≤ z] is thus the 
required false alarm probability function. 
It gives the probability that pure noise 
alone could produce a peak at the in- 
spected frequencies of value higher than 
the threshold z. 
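
The four steps above can be sketched in a few lines of code. This is an illustrative sketch only, not the code used for the simulations reported here. It assumes the standard Lomb-Scargle form of the periodogram, normalised by the sample variance of each pseudo-random series; the sampling times, the frequency grid (a step of one quarter of 2π/T) and all function names are hypothetical choices made for the example.

```python
import bisect
import math
import random

def periodogram_basis(times, omegas):
    """Precompute the Lomb-Scargle cosine and sine vectors for each angular frequency."""
    basis = []
    for w in omegas:
        tau = math.atan2(sum(math.sin(2 * w * t) for t in times),
                         sum(math.cos(2 * w * t) for t in times)) / (2 * w)
        c = [math.cos(w * (t - tau)) for t in times]
        s = [math.sin(w * (t - tau)) for t in times]
        basis.append((c, sum(v * v for v in c), s, sum(v * v for v in s)))
    return basis

def highest_power(values, basis):
    """Step 4: highest power over the preselected grid (sample-variance normalisation)."""
    n = len(values)
    mean = sum(values) / n
    resid = [x - mean for x in values]
    var = sum(r * r for r in resid) / (n - 1)
    best = 0.0
    for c, cc, s, ss in basis:
        power = 0.0
        if cc > 1e-12:  # guard against a degenerate cosine term
            power += sum(r * v for r, v in zip(resid, c)) ** 2 / cc
        if ss > 1e-12:  # guard against a degenerate sine term
            power += sum(r * v for r, v in zip(resid, s)) ** 2 / ss
        best = max(best, power / (2.0 * var))
    return best

def simulate_highest_powers(times, omegas, n_sims=300, seed=1):
    """Steps 1 to 4: sorted highest powers from n_sims Gaussian pure-noise series."""
    rng = random.Random(seed)
    basis = periodogram_basis(times, omegas)
    return sorted(highest_power([rng.gauss(0.0, 1.0) for _ in times], basis)
                  for _ in range(n_sims))

def false_alarm_probability(z, sorted_maxima):
    """Empirical 1 - Pr[Z_max <= z]: fraction of noise series whose highest peak exceeds z."""
    exceed = len(sorted_maxima) - bisect.bisect_right(sorted_maxima, z)
    return exceed / len(sorted_maxima)

def threshold(p, sorted_maxima):
    """Smallest simulated power whose empirical false alarm probability is at most p."""
    n = len(sorted_maxima)
    k = min(n - 1, max(0, math.ceil((1.0 - p) * n) - 1))
    return sorted_maxima[k]

# Hypothetical unevenly spaced sampling times and frequency grid.
rng = random.Random(0)
times = sorted(rng.uniform(0.0, 100.0) for _ in range(30))
span = times[-1] - times[0]
omegas = [2.0 * math.pi * k / (4.0 * span) for k in range(1, 61)]

maxima = simulate_highest_powers(times, omegas)
z99 = threshold(0.01, maxima)
print(z99, false_alarm_probability(z99, maxima))
```

In use, a peak of height z in the periodogram of the real data, computed on the same grid and with the same normalisation, would be assigned the significance `false_alarm_probability(z, maxima)`. A few hundred simulations resolve only probabilities down to roughly the reciprocal of the number of simulations, so stricter thresholds require correspondingly more pseudo-random series.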

How do we choose the frequencies at which to 
sample the periodogram? In a sense, it makes lit- 
tle difference how we choose them since, once cho- 
sen, we generate an empirical false alarm probabil- 
ity function that is tailor-made for our particular 
choice. However, for each choice, there is a price 
to be paid, and the final decision on how to choose 
the sampling frequencies is determined by what we 
consider to be the best compromise between the 
price paid and the advantage gained. For a given 
false alarm probability pa, the denser the sam- 
pling, the higher the associated threshold z, with 
the heaviest penalty being paid for over-sampling 
sufficiently dense as to produce a fully resolved pe- 
riodogram curve. In our simulations, this occurred 
at approximately five times the Scargle sampling 
rate, that is, using Aiv = ^(nN /T). (See Figure 
5.) The sampling rate sufficient to guarantee con- 
vergence to the limiting CDF must be established 
individually for each data set. This can be done 
using plots like those shown in Figures 6 and 7. 

If we are interested in pinpointing precisely the 
frequency of a peak (as we are in asteroseismol- 
ogy), then gross over-sampling may be the route 
to follow. However, there is a limit to the amount 
of information contained in the periodogram of a 
finite time series. There is therefore also a limit 
to how finely the frequency axis should be sub- 
divided. This limit is given by Δω_min = π/T, 
which is the smallest frequency interval that can 
reasonably be resolved by the data set. Dense 
over-sampling in pursuit of the convergence limit 
of the CDF may lead to a choice of Δω smaller 
than this interval. If the limiting CDF differs sub- 
stantially from that obtained from Δω_min, then 
limiting the sampling interval to Δω_min may be a 
better option. 

As noted previously, Schwarzenberg-Czerny 
(1998) has little confidence in this method. Apart 
from the comments already reported, he further 
says, p 832, "Experiments often demonstrate dif- 
ficulties in the reproduction of theoretical single- 
trial distributions by simulations. Hence the ana- 
lytical single-trial probabilities discussed here are 
essential for the verification of Monte Carlo simu- 
lations". He has a similar comment on p 833 from 
which he draws the conclusion that "single-trial 
analytical probability distributions are indispens- 
able in any strategy for bandwidth correction". 
It is not clear how Schwarzenberg-Czerny intends 
the phrase "single-trial probabilities" to be un- 
derstood, but however we interpret it, these com- 
ments leave us with the same dilemma. It seems 
to us that our only recourse in assessing peri- 
odogram peak significance in the general case of 
unevenly spaced data is Monte Carlo simulation. 
Though the problems with Monte Carlo simula- 
tions pointed out by Schwarzenberg-Czerny are 
real, they are not problems of principle, but of 
practical implementation. They therefore can, 
and will, be overcome in time, if they have not 
been overcome already. 
Fig. 6. — Logarithmic plot of the sum of square 
deviations (from the limiting CDF) of the CDF 
for v times over-sampling vs. the over-sampling 
factor v. The convergence to the limiting curve is 
seen to be very rapid. For the data set used in this 
simulation, the convergence occurs approximately 
at an over-sampling factor of 5. The convergence 
shows up in this plot as a sharp levelling off of the 
graph. 
Fig. 7. — Log-log plot of the sum of square devi- 
ations (from the limiting CDF) of the CDF for v 
times over-sampling vs. the over-sampling rate v. 
10. SUMMARY AND CONCLUSIONS 

Currently available theoretical false alarm 
probability functions are all derived from what 
appear to be reasonable assumptions about the 
data to be analysed and about the periodograms 
that they yield. Their validity, reliability and 
usefulness therefore strongly depend on how well 
these assumptions are met in practice. 

A key assumption made by all authors is that 
the frequency range inspected in the periodogram 
contains a set of Ni frequencies {ω_k : k = 1, ..., Ni} 
at which the periodogram powers P_X(ω_k) are mu- 
tually independent. In theoretical statistical anal- 
ysis, we have little hope of obtaining a false alarm 
probability function in the absence of this assump- 
tion. Without independence, very few general sta- 
tistical results are available, and none are relevant 
to the problem at hand. The assumption of the ex- 
istence of independent frequencies is therefore nec- 
essary in any theoretical discussion of the problem 
of significance of periodogram peaks and poses the 
first and most important obstruction to its resolu- 
tion. 

The existence of independent frequencies is 
guaranteed when the data are evenly spaced. 
We should therefore be able to test the valid- 
ity of proposed false alarm probability functions 
for this case against the results of Monte Carlo 
simulations. Reasonable requirements on candi- 
date functions include a good fit to the empirical 
CDF's, and their ability to predict correctly the 
number of independent frequencies known to exist 
from the theory. 

Though they seem not to have viewed their 
work in this light, HB effectively performed this 
test for Scargle's false alarm function. They 
constructed the empirical CDF for periodogram 
peak heights produced by a pure noise time se- 
ries consisting of No evenly spaced data points, 
and fitted the Scargle false alarm function to it 
by least squares using the number Ni of inde- 
pendent frequencies as the fitting parameter. Ac- 
cording to the theory, they should have obtained 
Ni ≤ [No/2]. However, their results consistently 
yielded Ni > No. Horne and Baliunas did not 
comment on this anomaly. 
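
The fitting procedure just described can be illustrated on synthetic data. The sketch below is a minimal example under stated assumptions, not a reconstruction of the HB code: it draws maxima directly from Scargle's model (the largest of N independent unit-mean exponential powers, with N = 20), builds the empirical CDF using mid-rank plotting positions, and recovers N by a least-squares grid search. The function name `fit_n_scargle`, the plotting positions and the grid search are all choices made here for illustration.

```python
import math
import random

def fit_n_scargle(sorted_maxima):
    """Least-squares estimate of N in the Scargle CDF (1 - exp(-z))**N."""
    n = len(sorted_maxima)
    u = [1.0 - math.exp(-z) for z in sorted_maxima]
    ecdf = [(j + 0.5) / n for j in range(n)]  # mid-rank plotting positions (assumption)

    def sse(big_n):
        # Sum of squared deviations between model CDF and empirical CDF.
        return sum((uj ** big_n - ej) ** 2 for uj, ej in zip(u, ecdf))

    coarse = min(range(1, 201), key=sse)                    # integer grid first
    fine = [coarse - 1.0 + 0.01 * i for i in range(1, 200)]  # then refine to 0.01
    return min(fine, key=sse)

# Synthetic test case: each maximum is the largest of 20 independent exponential powers.
rng = random.Random(2)
true_n = 20
maxima = sorted(max(rng.expovariate(1.0) for _ in range(true_n))
                for _ in range(2000))
print(fit_n_scargle(maxima))
```

When the maxima really do come from independent exponential powers, the fitted value lands close to the true N. The discrepancies discussed in this paper arise because the simulated periodogram maxima do not follow that model.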

We have repeated their simulations, obtain- 
ing results similar to theirs only for gross over- 
sampling of the periodogram, and only for data 
sets with No < 170 data points. For gross over- 
sampling and data sets with No > 170, we were 
unable to reproduce their results. The values of 
Ni obtained by HB are consistently and system- 
atically larger than ours. In our simulations, the 
best-fit value of Ni increases linearly with No, in 
conflict with the quadratic dependence claimed by 
HB. Inspection of a plot of the values published 
in HB appears to indicate that their points lie 
on two straight lines, with a disjunction of slope 
at No = 170 data points. We conjecture from 
these results that HB constructed their empirical 
CDF's by gross over-sampling of the periodogram. 
This might explain why they consistently obtained 
Ni > No. We also conjecture that the sharp dis- 
junction in slope at No = 170, which is not ob- 
served in our simulations, is due to a systematic 
error in theirs. If so, the quadratic dependence 
of Ni on No, sometimes exploited by astronomers 
in the analysis of their data, might not be a real 
feature of real astronomical data but rather a spu- 
rious artifact of the HB simulations. 

Given the assumption of independence that lies 
at the heart of Scargle's derivation of his false 
alarm probability function, it seemed unreason- 
able to suppose that it would provide an adequate 
description of the empirical CDF's obtained by 
over-sampling the periodograms. Accordingly, we 
initially ran the HB simulations for CDF's con- 
structed by sampling the periodograms only at 
the natural frequencies. The best-fit values of 
Ni were very close to the theoretically expected 
value of [No/2]. Though heartening, the results 
of these simulations displayed a disconcerting fea- 
ture: the best-fit Scargle functions were very poor 
fits to the empirical CDF's, displaying large de- 
viations from the empirical CDF's in the domain 
of most interest when assessing the significance of 
periodogram peaks. The theoretical false alarm 
functions were consistently substantially higher in 
value than the empirical CDF's, leading to severe 
under-estimation of peak significance. This same 
behaviour was observed for the best-fit curves to 
the CDF's constructed by over-sampling the pe- 
riodograms. Researchers using the Scargle false 
alarm function, with or without the HB algorithm, 
are thus at significant risk of rejecting peaks that 
reflect real periodicities in their data. 

A flaw in Scargle's derivation of his false alarm 
function was pointed out by Koen (1990) and by 
Schwarzenberg-Czerny (1996): Scargle assumes 
that the variance σ² of the noise is known a 
priori. This condition is satisfied neither in 
the simulations (where we sample pure pseudo- 
Gaussian noise), nor in real data sets (where 
the data variance must be estimated from the 
data themselves). Both Koen and Schwarzenberg- 
Czerny correct this error in their respective treat- 
ments of the problem. Koen (1990) concludes 
that Scargle's false alarm function should be re- 
placed by the Fisher (or, Fisher-Snedecor) distri- 
bution. Schwarzenberg-Czerny (1998) pointed out 
that the Fisher distribution is applicable only for 
ratios of independent random variables. With ra- 
tios of random variables that are not independent, 
the Fisher distribution must be replaced by the 
incomplete β-function. He also showed that, in 
the case of the Lomb-Scargle periodogram, the 
correct distribution is given by the incomplete 
beta function. On the strength of the work of 
these authors, we tested Schwarzenberg-Czerny's 
proposed function on CDF's constructed by over- 
sampling periodograms and also on CDF's con- 
structed by sampling only at the natural frequen- 
cies. In both cases, we have found the best-fit 
Schwarzenberg-Czerny function, obtained by the 
HB algorithm, consistently to fit the empirical 
CDF's far more closely than Scargle's function, 
with impressively good agreement on all but the 
smallest data sets, where the theoretical function 
deviates only slightly from the empirical CDF's. 

In spite of the excellent fits provided by the 
Schwarzenberg-Czerny false alarm function, our 
simulations display an alarming feature: the best-fit
values of $N_i$ that yield such excellent agreement
with the empirical CDF's are all consistently
higher than the theoretically expected value of
$[N_0/2]$. This is not unexpected for CDF's constructed
by over-sampling. In the case of CDF's
constructed by sampling only at the natural fre- 
quencies, however, this result is in conflict with 
the theory. This means that, as in the case of the 
Scargle function, we cannot interpret the best-fit 
value of the parameter $N_i$ as the number of independent
frequencies. It must be regarded rather
as a fitting parameter in a one-parameter family
of candidate CDF functions that fit the empirical
CDF's better than Scargle's candidate functions.
Note that the Schwarzenberg-Czerny false
alarm functions constructed independently of simulations,
relying exclusively on the use of a priori
theoretical values for $N_i$, badly overestimate the
significance of periodogram peaks and may result 
in the acceptance of spurious peaks as genuine. It 
would seem therefore that unqualified confidence 
in analytical single-trial probability distributions 
in the construction of false alarm probability func- 
tions may be misplaced. Even in those cases where 
they ought to provide a good description of the be- 
haviour of the empirical CDF's, they apparently 
fail to do so, leaving us no option but to resort to 
Monte Carlo simulations and to treat the theoret- 
ical distributions as nothing more than candidate 
CDF functions to be accepted or rejected accord- 
ing to their utility in providing a good fit to the 
empirical curves in the region of interest. 

Ultimately, our principal interest is in the case 
of unevenly spaced data, not data that are evenly 
spaced. The loss of independence of the variables 
$P_X(\omega)$ in this case calls into question the validity
and the expediency of searching for a formula in 
closed form for a false alarm probability function. 
All formulae proposed hitherto are based on the 
assumption of the existence of a set of mutually 
independent periodogram powers. This assumption
is not realistic in uneven sampling schemes,
as shown by Koen (1990). Were a set of approximately
uncorrelated periodogram powers to be
found, in the sense outlined by Scargle, this still 
would not guarantee their approximate indepen- 
dence. The currently proposed closed-form for- 
mulae therefore cannot be expected to provide ac- 
curate false alarm criteria. Even if we adopt the 
attitude that the proposed formulae are no more 
than candidate CDF functions, the value of the parameter
$N_i$ is not known a priori, independently
of Monte Carlo simulations. Theoretical
probability distributions therefore provide no predictive
power, independent of the empirical CDF's generated
by simulations, in determining the false alarm criteria
appropriate to a given data set.
In the final analysis, the only way to obtain the 
appropriate false alarm probability function is by 
first constructing empirical CDF's for the maxi- 
mum peak heights by using Monte Carlo methods, 
and then fitting these distributions with the false 
alarm function of choice. If a sufficiently good fit 
is obtained, the fitted function can then be used to 
calculate the significance levels for the given data 
set. If the fit is not good, however, the significance
levels predicted by these fitted functions are likely
to lead to erroneous rejection or acceptance of periodogram
peaks, making them almost useless in
the assessment of the significance of peaks.
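
The procedure just described can be sketched in a few lines of Python for the
evenly spaced case. This is only an illustrative sketch under stated assumptions,
not the code used for the simulations reported here: the function names are
hypothetical, the periodogram is normalised by the sample variance of each
simulated data set, the candidate family is the Scargle form
$(1 - e^{-z})^{N_i}$ for the CDF of the maximum peak height, and a plain
least-squares grid search stands in for the HB fitting algorithm.

```python
import math
import random


def natural_frequencies(n, dt=1.0):
    """Angular frequencies 2*pi*k/(n*dt) for k = 1 .. floor(n/2)."""
    return [2.0 * math.pi * k / (n * dt) for k in range(1, n // 2 + 1)]


def classical_periodogram(x, omegas, dt=1.0):
    """(1/N) |sum_r x_r exp(-i*omega*t_r)|^2 for evenly spaced t_r = r*dt."""
    n = len(x)
    powers = []
    for w in omegas:
        c = sum(v * math.cos(w * r * dt) for r, v in enumerate(x))
        s = sum(v * math.sin(w * r * dt) for r, v in enumerate(x))
        powers.append((c * c + s * s) / n)
    return powers


def simulate_max_peaks(n, n_sims, rng):
    """Sorted maximum normalised peak heights from n_sims pure-noise data sets."""
    omegas = natural_frequencies(n)
    maxima = []
    for _ in range(n_sims):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(x) / n
        # Assumption: the variance is estimated from the data, not taken as known.
        var = sum((v - mean) ** 2 for v in x) / (n - 1)
        maxima.append(max(classical_periodogram(x, omegas)) / var)
    return sorted(maxima)


def scargle_cdf(z, n_i):
    """Scargle-form candidate CDF of the maximum peak height."""
    return (1.0 - math.exp(-z)) ** n_i


def fit_n_i(sorted_maxima, candidates):
    """Grid search for the n_i closest (least squares) to the empirical CDF.

    Returns (best n_i, sum of squared residuals) so fit quality can be judged.
    """
    m = len(sorted_maxima)
    best = None
    for n_i in candidates:
        sse = sum((scargle_cdf(z, n_i) - (j + 1) / m) ** 2
                  for j, z in enumerate(sorted_maxima))
        if best is None or sse < best[1]:
            best = (n_i, sse)
    return best


if __name__ == "__main__":
    rng = random.Random(1)
    maxima = simulate_max_peaks(n=32, n_sims=200, rng=rng)
    candidates = [0.5 * k for k in range(2, 81)]  # trial values 1.0 .. 40.0
    n_i, sse = fit_n_i(maxima, candidates)
    print("best-fit N_i:", n_i, " sum of squared residuals:", sse)
```

The residual returned alongside the fitted parameter matters as much as the
parameter itself: a small residual over the whole curve does not guarantee a good
fit in the high-significance tail, so in practice the comparison should be
restricted to, or at least inspected in, the region of interest.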

At first, this dilemma appears irresolvable. On 
reflection, however, its resolution is staring us in 
the face. What we need is a reliable false alarm 
probability function. Though we do not possess 
this function as a closed-form formula, we never- 
theless have a numerical plot of it in the form of 
the CDF of maximum peak heights. This plot can 
be used just as easily as any closed-form formula to
get the answers that we want. If we insist on hav- 
ing a closed-form formula to facilitate significance 
estimation, the empirical CDF can be fitted in 
the region of interest by any number of candidate 
fitting-functions. This renders the need to search 
for a theoretical formula obsolete. Of course, it 
would be nicer, more convenient and more satisfy- 
ing to have a theoretical formula, but a numerical 
plot of the same function is almost as good. 
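
As an illustration of how the numerical CDF can be used directly, the sketch
below reads a false alarm probability and a significance threshold off a sorted
list of simulated maximum peak heights. The function names are hypothetical and
the list of maxima is assumed to come from Monte Carlo simulations matched to the
sampling of the real data set; ties among the simulated maxima are assumed not to
occur.

```python
import bisect
import math


def empirical_false_alarm(sorted_maxima, z):
    """Fraction of simulated maximum peak heights that are >= z."""
    m = len(sorted_maxima)
    below = bisect.bisect_left(sorted_maxima, z)  # maxima strictly below z
    return (m - below) / m


def significance_threshold(sorted_maxima, false_alarm):
    """Smallest simulated peak height whose empirical false alarm
    probability does not exceed `false_alarm` (assumes no ties)."""
    m = len(sorted_maxima)
    # Need (m - k)/m <= false_alarm, i.e. k >= m*(1 - false_alarm).
    k = math.ceil(m * (1.0 - false_alarm) - 1e-12)
    if k >= m:
        raise ValueError("too few simulations to resolve this false alarm level")
    return sorted_maxima[k]
```

The `ValueError` makes explicit a practical limit of the approach: a false alarm
level of 0.01 cannot be resolved with fewer than about a hundred simulations, and
a stable estimate in the tail needs considerably more.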

In this paper we have studied almost exclu- 
sively significance tests for the rejection of spu- 
rious peaks in periodograms of data sets that are 
evenly-spaced in time. An analogous study for 
unevenly-spaced data sets is currently in prepara- 
tion. 

We sincerely thank Mike Gaylard, Chris Koen 
and Melvyn Varughese for their critical reading of
draft versions of the manuscript. We also thank 
the South African SKA Office in Johannesburg for 
use of their facilities. 

A. CLASSICAL PERIODOGRAM



The classical periodogram is essentially a Fourier power spectrum estimator for an infinite continuous-time
signal $X(t)$ that has been discretely sampled for a finite time at equally spaced time intervals. The data
for this estimator form a finite discrete time series consisting of $N$ values $X_i = X(t_i)$, $i = 1, \ldots, N$, of the
physical parameter $X$ at times $t_i = t_0,\ t_0 + \Delta t,\ t_0 + 2\Delta t,\ \ldots,\ t_0 + (N-1)\Delta t$. The discrete Fourier transform
(DFT), $\mathrm{DFT}_X(\omega)$, of this time series $X_i$, which is defined by



$$\mathrm{DFT}_X(\omega) = \sum_{r=1}^{N} X(t_r)\,e^{-i\omega t_r} \tag{A1}$$



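may be regarded as an estimator of the Fourier transform $\mathrm{FT}_X(\omega)$ of $X(t)$. The power spectral density of
the signal may then be estimated by the function

$$\left|\mathrm{DFT}_X(\omega)\right|^2$$

with some suitably chosen normalising coefficient. A commonly used normalisation is

$$CP_X(\omega) = \frac{1}{N}\left|\mathrm{DFT}_X(\omega)\right|^2 = \frac{1}{N}\left|\sum_{r=1}^{N} X(t_r)\,e^{-i\omega t_r}\right|^2 \tag{A2}$$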



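A simple calculation then yields the formula,

$$CP_X(\omega) = \frac{1}{N}\left[\left(\sum_{r=1}^{N} X(t_r)\cos\omega t_r\right)^2 + \left(\sum_{r=1}^{N} X(t_r)\sin\omega t_r\right)^2\right] \tag{A3}$$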
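
For completeness, the calculation leading from (A2) to (A3) can be spelled out as
follows, on the assumption that the sampled values $X(t_r)$ are real:

```latex
\begin{align*}
\sum_{r=1}^{N} X(t_r)\,e^{-i\omega t_r}
  &= \sum_{r=1}^{N} X(t_r)\cos\omega t_r \;-\; i\sum_{r=1}^{N} X(t_r)\sin\omega t_r
  && \text{(Euler's formula)} \\
\left|\sum_{r=1}^{N} X(t_r)\,e^{-i\omega t_r}\right|^2
  &= \left(\sum_{r=1}^{N} X(t_r)\cos\omega t_r\right)^2
   + \left(\sum_{r=1}^{N} X(t_r)\sin\omega t_r\right)^2
  && \left(|a - ib|^2 = a^2 + b^2 \text{ for real } a,\, b\right)
\end{align*}
```

Multiplying the second line by the normalising factor $1/N$ gives the classical
periodogram in its cosine and sine form.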



Following Scargle (1982), we call this function the classical periodogram. This definition agrees with that
given originally by Schuster in Schuster (1898), but not with that in his later publications. It also agrees
with the definitions used in Thompson (1971) and Deeming (1975), and differs by a factor of two from that
used by Priestley (1981).

It is easy to see from equation (A2) why the classical periodogram is useful in identifying the frequencies
of harmonic components in the signal $X$. Suppose $X$ contains an harmonic component of frequency $\omega_0$. Then,
when $\omega$ is very different from $\omega_0$, $X(t)$ and $e^{-i\omega t}$ are out of phase, and the product $X(t)e^{-i\omega t}$ oscillates rapidly.
The sum of the products $X(t_r)e^{-i\omega t_r}$, which is a discrete estimator of the integral $\int X(t)e^{-i\omega t}\,dt$, will thus
have a value close to zero, albeit masked by whatever other signal is present in $X(t)$. As $\omega$ approaches the
value of $\omega_0$, the factors $X(t)$ and $e^{-i\omega t}$ get closer in phase, so the product $X(t)e^{-i\omega t}$ oscillates more slowly.
The value of the sum of the products $X(t_r)e^{-i\omega t_r}$ will thus rise, reaching a maximum at $\omega = \omega_0$. The presence
of a harmonic signal of frequency $\omega_0$ thus produces a peak in the periodogram with maximum at $\omega_0$.
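
The behaviour described above is easy to check numerically. The following sketch
(hypothetical function names; a noise-free cosine sampled at unit intervals, with
its frequency deliberately placed on one of the natural frequencies) evaluates
the classical periodogram in its cosine and sine form and locates the peak.

```python
import math


def classical_periodogram(x, times, omega):
    """(1/N) [ (sum x cos(omega t))^2 + (sum x sin(omega t))^2 ]."""
    n = len(x)
    c = sum(v * math.cos(omega * t) for v, t in zip(x, times))
    s = sum(v * math.sin(omega * t) for v, t in zip(x, times))
    return (c * c + s * s) / n


def harmonic_demo(n=64, k0=8):
    """Periodogram of cos(omega_0 t) scanned over the natural frequencies.

    Returns (omegas, powers, omega_0); omega_0 = 2*pi*k0/n is an assumption
    made purely so that the peak falls exactly on a scanned frequency.
    """
    times = [float(r) for r in range(n)]                      # unit time step
    omegas = [2.0 * math.pi * k / n for k in range(1, n // 2)]
    omega_0 = omegas[k0 - 1]
    signal = [math.cos(omega_0 * t) for t in times]
    powers = [classical_periodogram(signal, times, w) for w in omegas]
    return omegas, powers, omega_0


if __name__ == "__main__":
    omegas, powers, omega_0 = harmonic_demo()
    peak = max(range(len(powers)), key=powers.__getitem__)
    print("peak at omega =", omegas[peak], " true omega_0 =", omega_0)
```

With these choices the power at the signal frequency is $N/4$ and the power at
every other natural frequency vanishes to rounding error. Once noise is added,
or the signal frequency falls between scanned frequencies, the picture is no
longer this clean.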

The converse however is not true. A peak in the periodogram does not necessarily reflect the presence of 
an harmonic component in the signal X. Peaks might be produced by other effects. Thus, the presence of 
measurement error, signal noise, or random physical processes in the observed system might, by a spurious 
random fluctuation, also produce a peak. Peaks may also be produced by aliasing and/or spectral leakage,
and the observing window. The potential for producing peaks that are not due to harmonic components in
the observed signal makes the interpretation of peaks in the periodogram very difficult and presents many
hazards and pitfalls for the unwary. The dangers posed by these effects were already noted by Arthur
Schuster as early as 1906: "... it has generally been assumed that each maximum in the amplitude of a
harmonic term corresponded to a true periodicity. The extent to which this fallacious reasoning has been
made use of would surprise anyone not familiar with the literature of the subject." (Schuster 1906, p. 71-72).
Strangely, his warning has often been ignored, and sometimes even disdainfully brushed aside. 

B. LOMB-SCARGLE PERIODOGRAM

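Following Scargle (1982), Appendix B, we define the Lomb-Scargle periodogram by the formula

$$P_X(\omega) = \frac{1}{2}\left\{\frac{\left[\sum_{i=1}^{N} X_i\cos\omega(t_i-\tau)\right]^2}{\sum_{i=1}^{N}\cos^2\omega(t_i-\tau)} + \frac{\left[\sum_{i=1}^{N} X_i\sin\omega(t_i-\tau)\right]^2}{\sum_{i=1}^{N}\sin^2\omega(t_i-\tau)}\right\} \tag{B1}$$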



where the epoch translation parameter $\tau(\omega)$ is defined implicitly by the formula

$$\tan(2\omega\tau) = \frac{\sum_{i=1}^{N}\sin(2\omega t_i)}{\sum_{i=1}^{N}\cos(2\omega t_i)} \tag{B2}$$



The data used to calculate $P_X(\omega)$ form a finite discrete time series consisting of $N$ values $X_i = X(t_i)$,
$i = 1, \ldots, N$, of the physical parameter $X$ measured at times $\{t_i \mid i = 1, 2, \ldots, N\}$ which are arbitrarily spaced
in time. Lomb (1976), following Barning (1963) and Vanicek (1963), arrived at this formula via a least
squares fitting procedure in which sampled values $X(t_i)$ are fitted with an harmonic signal of frequency $\omega$.
For these three authors therefore, $P_X(\omega)$ does not represent an attempt at estimating the Fourier power
spectrum of any continuous time physical signal $X(t)$, but is rather a spectral best-fit parameter that displays
how closely the data may be fitted with a single harmonic function of frequency $\omega$. The larger the value of
$P_X(\omega)$, the better the fit.

In contrast with these authors, Scargle (1982) arrived at the same spectral function by first relaxing the
definition of the DFT for application to the case of unevenly spaced data (Scargle 1982, Appendix A), and
then imposing two demands on the periodogram (which he calls the modified, or generalised periodogram)
calculated from this relaxed definition: the statistical distribution of the generalised periodogram will be
made as closely as possible the same as it is in the evenly spaced case; and, the generalised periodogram will
be made invariant to translations in time. These two requirements yield uniquely the formulae in equations
(B1) and (B2). Arguably, this restores the interpretation of the modified periodogram as an estimator of
the power spectrum of the physical signal X(t) in the case where the signal is unevenly sampled in time. 
However, it is probably more accurate to regard it as a spectral goodness-of-fit parameter. This view also 
enables one better to understand a variety of other, alternative, periodogram formulae currently offered in 
the literature. 
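
A direct transcription of the two defining formulae into Python may help to make
them concrete. This is a minimal sketch with a hypothetical function name. It
assumes the conventional factor of one half in the definition, takes the epoch
translation parameter from the two-argument arctangent of the sums in its
defining relation, and makes no attempt to guard against the degenerate case in
which one of the denominators vanishes.

```python
import math


def lomb_scargle(times, values, omega):
    """Lomb-Scargle power at angular frequency omega for arbitrary sample times."""
    # Epoch translation tau: tan(2*omega*tau) = sum sin(2wt) / sum cos(2wt).
    s2 = sum(math.sin(2.0 * omega * t) for t in times)
    c2 = sum(math.cos(2.0 * omega * t) for t in times)
    tau = math.atan2(s2, c2) / (2.0 * omega)

    cos_terms = [math.cos(omega * (t - tau)) for t in times]
    sin_terms = [math.sin(omega * (t - tau)) for t in times]

    xc = sum(x * c for x, c in zip(values, cos_terms))
    xs = sum(x * s for x, s in zip(values, sin_terms))
    cc = sum(c * c for c in cos_terms)
    ss = sum(s * s for s in sin_terms)

    # Assumption: no guard for cc == 0 or ss == 0 (degenerate sampling).
    return 0.5 * (xc * xc / cc + xs * xs / ss)


if __name__ == "__main__":
    # Unevenly spaced times and a pure sinusoid of angular frequency 1.3.
    times = [i + 0.3 * math.sin(i) for i in range(50)]
    values = [math.sin(1.3 * t) for t in times]
    shifted = [t + 100.0 for t in times]
    print(lomb_scargle(times, values, 1.3), lomb_scargle(shifted, values, 1.3))
```

The usage lines print the same power, to rounding error, for the original and
the shifted time stamps, which is the invariance under translations in time that
the choice of the epoch translation parameter is designed to secure.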



C. PURE NOISE

A random process $X(t)$ is said to be a purely random process, pure noise, or white noise, if it consists of
a sequence of uncorrelated random variables. This means that, for all $t' \neq t$,

$$\mathrm{cov}(X(t), X(t')) = 0$$

Pure noise is the simplest of all random process models. It corresponds to a case where the process has
"no memory" in the sense that the value of the random variable $X(t)$ at time $t$ has no relation whatever
to its value $X(t')$ at any other time $t'$, no matter how close or distant $t$ and $t'$ are to each other. In this
sense, $X(t)$ neither remembers its past, nor is aware of its future. Knowing the value of $X(t_0)$ at any time
$t_0$ therefore provides no way at all, other than by the probability distribution $P_{X(t)} = p(x, t)$, of predicting
within reasonable limits and uncertainties the value of $X(t)$ at time $t$. This is to be contrasted with correlated
noise where the values $X(t)$ and $X(t')$ are in general related or 'correlated'. In this case, we can do better in
predicting the value of $X(t + \tau)$ from $X(t)$ than in the case of uncorrelated, or pure, noise. From a knowledge
of the value $X(t)$, we can set narrower limits on the probable values of $X(t + \tau)$ than is possible from the
distribution $P_{X(t+\tau)}$ alone (Priestley 1981, p. 114).

Pure noise is said to be Gaussian if the random variables X(t) are jointly normally distributed. Noise of 
this kind is often called Gaussian white noise. In this case, the random variables {X(t)} are also mutually 
independent. 

Note that some authors define pure noise more stringently. For them, a random process X(t) is pure noise
if the random variables {X(t)} are independent, and identically distributed with zero mean. 

In this paper, a data set $\{X_k \mid k = 1, 2, \ldots, N_0\}$ is said to be pure noise if the values $X_k$ are independent,
and identically distributed random variables with zero mean. For simplicity, we assume also that the $X_k$
are each normally distributed. Denote their common variance by $\sigma_X^2$. Since the $X_k$ have zero mean, their
covariance matrix is given by

$$C_{jk} = E[(X_j - \mu_{X_j})(X_k - \mu_{X_k})] = E[X_j X_k] = \sigma_X^2\,\delta_{jk} \tag{C1}$$
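
The diagonal form of this covariance matrix can be checked by simulation. The
sketch below (hypothetical function name; Python's standard pseudo-random
Gaussian generator stands in for pure noise) averages the products of pairs of
values over many independent realisations of a short pure-noise data set.

```python
import random


def sample_second_moments(n0, sigma, n_realisations, rng):
    """Estimate E[X_j X_k] from independent zero-mean Gaussian data sets."""
    acc = [[0.0] * n0 for _ in range(n0)]
    for _ in range(n_realisations):
        x = [rng.gauss(0.0, sigma) for _ in range(n0)]
        for j in range(n0):
            for k in range(n0):
                acc[j][k] += x[j] * x[k]
    return [[v / n_realisations for v in row] for row in acc]


if __name__ == "__main__":
    rng = random.Random(0)
    c = sample_second_moments(n0=4, sigma=2.0, n_realisations=4000, rng=rng)
    for row in c:
        print(["%6.3f" % v for v in row])
```

The diagonal entries scatter about the variance (here 4) and the off-diagonal
entries about zero, with a scatter that shrinks as the inverse square root of
the number of realisations.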